FOREWORD


This report is intended to document the state of development
of Converter Group, Print-To-Digital AN/GSA-29 at the completion
of Task 14 of Contract AF 30(602)-2080. The primary objective of
the work under this task was the evaluation of the line following
system and masking technique.

The entire work assignment was performed at the Thomas J.
Watson Research Center and was carried out in the Experimental
Systems Research Department.

ABSTRACT


The Optical Russian Print Reader (Converter Group,
Print-To-Digital AN/GSA-29) has been assembled, the front
end optics aligned, and the line following servo system analyzed
with the assistance of Baird-Atomic personnel. An IBM 7090
simulation shows that the basic masking technique used for an
idealized electro-optical system yields adequate discrimination
levels only for very high quality characters and for very close
tolerances on text registration. The report contains a detailed
description of the servo analysis and masking technique simulation;
it also includes error rate tabulations based on input text
quality and proposed mask alterations.

ENGINEERING ANALYSIS AND DIGITAL SIMULATION
OF  THE  OPTICAL  RUSSIAN  PRINT  READER 


INTRODUCTION 

An analysis of the line following servo system and masking technique used in the Optical
Russian Print Reader (Converter Group, Print-To-Digital AN/GSA-29) has been made. The line
following servo system was statically tested and its operation also simulated. The masking
technique was simulated using an IBM 7090 computer with digitalized characters. The simulation
was also used to investigate the effect of mask alterations proposed to decrease the system
sensitivity to character misalignment. The overall simulation results have been used to predict a
lower limit on the error rate and to relate this error rate to character area changes,
misregistration, and some system noise.

SUMMARY

1. The design and action of the line follower appear to be adequate for "good" text.*

2. With excellent quality characters properly registered on original documents, the character
discrimination would be sufficient so that it should be within the state of the art to design an
optical system and an analog computer to differentiate between characters with a high degree of
reliability. However, the major limitation of this character recognition technique is the quality of
the source document, this quality being indicated by character area stability, shape stability and
individual vertical registration.

BACKGROUND

The equipment (Converter Group, Print-To-Digital AN/GSA-29) was assembled by Baird-Atomic
personnel and the front end optics aligned to permit operation of the line follower (Fine
Positioning Servo System).

An  IBM  7090  computer  simulation  was  used  rather  than  the  partially  completed  system  for 
analyzing  the  character  masking  technique  for  the  following  reasons: 

1.  The  quality  of  the  text  could  be  controlled  as  to  area  dropout  or  addition  and  vertical 
registration. 

2. The system variables could be bypassed, such as:

(a) Image distortion from the main optical path.

(b) Photomultiplier tube (PMT) and amplifier drift.

(c) Alignment of the 112 lens array for a given mask.

(d) Mechanical vibrations.

3. The proposed vertical mask expansion (slurring) to reduce vertical sensitivity could be
easily simulated.

The use of the computer simulation also predicts the upper bound on discrimination levels
for the tested Type font A† and allows a prediction of the system tolerances and text limitation.


*"Good" text meaning:

"Text with input characters which deviate from their respective masks by less than a few per cent in
total area imprinted and general shape, and which are aligned from character to character to within 3% of
their reference position."

†Boni, C. et al. "Russian Type Study", Technical Note 1, New York University, Division of General
Education, Sponsored by RADC, Contract AF 30(602)-1824, November 13, 1938.


CHARACTER  DIGITALIZATION 


The use of an IBM 7090 computer for the simulation of the character recognition masking
technique necessitated converting the character shapes into a digitalized format. The binary
coding 0-1 was used to signify by 0 the absence, and by 1 the occurrence, of character area at a
fixed location. The location size was chosen to be a square 0.00209 inch on a side when
referenced to the original document. The narrowest stroke occurring in the Type font A is
approximately 0.004 inch wide and would be represented by two binary bits. The binary bit size of
0.00209 inch at the original document corresponds to a resolution of 9.5 line pairs per millimeter,
which is about two times better than the optical resolution in the system as it is now and about
equal to a reasonable design resolution.
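
The resolution figure quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that one line pair corresponds to two adjacent bits (one dark, one light), and the function names are hypothetical. Under that assumption the result comes out close to the 9.5 line pairs per millimeter given in the text, and the 0.004 inch stroke spans roughly two bits.

```python
INCH_TO_MM = 25.4
BIT_SIZE_INCH = 0.00209  # side of one digitalized square, from the text


def line_pairs_per_mm(bit_size_inch):
    """Resolution if one line pair is taken as two adjacent bits (assumption)."""
    bit_size_mm = bit_size_inch * INCH_TO_MM
    return 1.0 / (2.0 * bit_size_mm)


def bits_across(stroke_inch, bit_size_inch=BIT_SIZE_INCH):
    """Number of bit squares needed to span a stroke of the given width."""
    return stroke_inch / bit_size_inch


resolution = line_pairs_per_mm(BIT_SIZE_INCH)
stroke_bits = bits_across(0.004)  # narrowest stroke of Type font A
print(f"{resolution:.1f} line pairs per mm, {stroke_bits:.1f} bits per stroke")
```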

The  character  digitalization  was  accomplished  by  obtaining  96  x  blowups  of  the  Russian 
characters  from  Type  font  A  (Figure  1).  The  original  photographs  of  the  characters  were  prepared 
by  Baird-Atomic  from  many  samples  and  were  used  in  preparing  the  type  font  masks.* 

The 96 x character photographs to be digitalized were overlaid with a 72 x 72 transparent
grid with the established base reference line above bit position number 22 of the even numbered
words as shown in the above figure. Each character was also positioned so that some character
area was shown as information in either Word #1 or Word #2. Each word is 36 bits long and the
maximum field for any character is 144 words. The card punching format was as follows:

Columns:    1 -  3    Decimal Character Number
            4 -  6    Blank
            7 - 18    Octal Data (second word)
           19 - 30    Octal Data (first word)
           31 - 77    Blank
           78 - 80    Decimal Card Number


The  resolution  of  this  grid  size  is  shown  in  Figure  3. 

Since the card punching was on an octal basis, the computer program was written to convert
the data to binary form for analysis. A binary printout of the digitalized characters has been made
and is shown in Parts 1 thru 3 of the data printout book.† The stars represent where the character
area filled the reference square by more than 50%; that is, each star represents a 0.00209 inch
square of character area (See Page 22). The star printout appears vertically compressed due to
the difference between line spacing and character spacing in the IBM 717 printer. The printouts
of Parts 1 thru 3 also list the area of the character as "total number of bits in grid", with each
bit again being a 0.00209 inch square at the original document.
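
The octal-to-binary conversion described above can be sketched as follows. This is an illustrative reconstruction, not the original 7090 program: the card image is made up for the example, the function names are hypothetical, and the placement of the first word in the left half of a 72-bit grid row is an assumption that the report does not state.

```python
WORD_BITS = 36  # IBM 7090 word length; two words make one 72-bit grid row


def parse_card(card):
    """Split one 80-column card image into its fields (columns are 1-based in the text)."""
    card = card.ljust(80)
    return {
        "character": int(card[0:3]),        # columns 1-3, decimal character number
        "second_word": int(card[6:18], 8),  # columns 7-18, octal data
        "first_word": int(card[18:30], 8),  # columns 19-30, octal data
        "card": int(card[77:80]),           # columns 78-80, decimal card number
    }


def word_to_bits(word):
    """Render a 36-bit word as a string of 0s and 1s."""
    return format(word, f"0{WORD_BITS}b")


def star_row(first_word, second_word):
    """One printout row: a star for each bit set, a dot otherwise.

    Assumption: the first word is the left half of the row.
    """
    bits = word_to_bits(first_word) + word_to_bits(second_word)
    return "".join("*" if b == "1" else "." for b in bits)


# Hypothetical card image built only for this example.
card = "017" + "   " + "000000000077" + "700000000000" + " " * 47 + "001"
fields = parse_card(card)
row = star_row(fields["first_word"], fields["second_word"])
print(fields["character"], fields["card"], row)
```

The star count of a full character, summed over all of its rows, would correspond to the "total number of bits in grid" listed in the printouts.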

Definitions:

Match - The value 2P - (P + N)* normalized by the true area of the reference character.

Auto-correlation - The match value when the indicated input character and reference
character are identical.

Cross-correlation - The match value when the indicated input character occurs in the listed
reference character channel.


*It should be noted that the reference to "perfect" characters and "perfect" mask to be used herein refers to
the Baird-Atomic photographs. The variation of size or shape in any given character has been eliminated
by using the same digitalized characters as both the input and the reference masks.

†The printout of the digitalized characters represents Parts 1 thru 3 of the data printout book, with Parts 4
thru 8 the simulation runs. The data book is 1287 pages long, and is being retained at the IBM Research
Center. Examples of the type of data obtained have been included and are shown on Pages 22 thru 45.

*See Page 9.

Figure 1. Type Font A.


LINE  FOLLOWER 

The purpose of the line follower is to establish a base line for proper vertical alignment of
the characters on the type font masks. The recognition system, as designed, is critically sensitive
to input vertical alignment on the positive character masks. The vertical sensitivity to alignment
was shown by the IBM 7090 character simulation as an inability of the recognition technique
to maintain discrimination between similar characters. It has been found that a vertical
displacement of ±0.00209 inch referenced back to the original document is troublesome. This vertical
displacement of ±0.00209 inch represents about 3% of the height of a lower case character and about
2% of the height of an upper case character. This therefore puts the stress on the line following
servo to maintain adequate discrimination levels.

The  line  following  servo,  as  presently  designed  by  Baird-Atomic,  establishes  an  average 
base  line  for  14  character  spaces.  The  servo  system  must  be  capable  of  film  positioning  to  within 
±0.00094  inch  to  maintain  a  ±0.00209  inch  tolerance  referenced  back  at  the  document. 

The decision time for character recognition is less than 23 microseconds, meaning that the line
servo does not have time to position each character individually during the decision period (400
cps servo with 160 cps reported bandwidth implies a 6.23 milliseconds response time for error
detection and correction).

The  mis-alignment  of  any  given  character  in  any  line  would  only  be  partially  corrected  by  the 
averaging  of  the  line  follower. 

The given film samples represent a reduction of a 3 inch wide column of printed material in
Type font A. The 3 inch column represents the maximum allowable width and has approximately
66 character spaces per line. The 11 lines per second designed operation gives a document
scanning speed of 88.3 inches per second (after correcting for dwell, flyback and idle times). This
therefore represents 23.4 microseconds per 0.00209 inch (23.4 microseconds per bit).

The character which is to be recognized has to move 9.8 character spaces from the center of
the line follower window to the recognition position. The movement takes 8.2 milliseconds, during
which the servo may possibly make only one correction.

The decision time of 23 microseconds is determined from the IBM 7090 simulation where the
auto-correlation function reaches its normalized peak of 1.00 and, in all cases, only remains at
this peak for one bit space, or 0.00209 inch on the original document.

Upper - 8 x blowup; Lower - actual size

Left - actual character; Right - digitalized character

Figure 3. Character #17, Actual and Digitalized.

As an example, it can be clearly seen that the threshold firing value for the H recognition
channel must be above 0.83 as this is the maximum cross-correlation normalized match value
reached by input character #56 H (See Page 26). This threshold value is not uniquely determined
for all characters, but is dependent on the highest cross-correlation obtained for each reference
character.

It is advantageous to have the threshold value as close as possible to the highest
cross-correlation in order to obtain as much insensitivity to vertical displacement as possible.

Thus, in the cited case, the auto-correlation for character #51 drops to 0.77 when the input is
shifted ±1 bit vertically; that is, 0.00209 inch vertically displaced from the reference line
established by the line servo when referenced back at the original document.

The auto-correlation then drops to 0.57 for a ±2 bit vertical shift. These latter values are
below the cross-correlation obtained from character #56 in the #51 channel. This means that it is
necessary to at least maintain better than ±1 bit vertical alignment for recognition of this
character.
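
The reasoning in the cited case can be written out as a small check. The sketch below uses only the figures quoted above (peak auto-correlation 1.00, 0.77 and 0.57 at ±1 and ±2 bits, and a maximum cross-correlation of 0.83); the function and variable names are hypothetical, and the rule that a channel is usable only while its auto-correlation exceeds the highest cross-correlation is a simplified reading of the threshold argument.

```python
MAX_CROSS = 0.83  # highest cross-correlation in the #51 channel (from input #56)

# Auto-correlation of character #51 versus vertical shift in bits, from the text.
AUTO_BY_SHIFT = {0: 1.00, 1: 0.77, 2: 0.57}


def recognizable(auto_corr, max_cross_corr):
    """True if a threshold can be placed between the two values."""
    return auto_corr > max_cross_corr


# Margin available for the threshold at each shift (negative means none).
margins = {s: round(a - MAX_CROSS, 2) for s, a in AUTO_BY_SHIFT.items()}
usable_shifts = [s for s, a in AUTO_BY_SHIFT.items() if recognizable(a, MAX_CROSS)]

print(margins)
print(usable_shifts)
```

Only the unshifted position leaves a positive margin, which is the basis for the requirement of better than ±1 bit vertical alignment.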

The Fine Positioning Servo System (Line Follower) analysis was accomplished by a static
operational test of the system and by also simulating the operation by use of the digitalized
characters.

The experimental analysis was limited in usefulness due to the uncertainty of measured
results. This uncertainty in test results was due to the following:

1. The image quality at the measuring reticle was very poor due to the system optics.

2. The light level was relatively low, which made the taking of measurements a very
fatiguing job.

3. The film used for the test was of good quality, but the typographical errors of
misalignment of the original text were an uncontrolled parameter.

4. The positioning servo was operated statically and no measure of its dynamic operation
was possible.


By assuming a normal distribution for the data obtained on 69 lower case letters, it is
estimated that approximately ten per cent of all characters will fall outside the range of plus or
minus 0.002 inch from their mean position. Approximately one per cent of the characters will fall
outside the range of plus or minus 0.003 inch from their mean position.
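
The two percentages above are consistent with a single normal distribution, which can be checked as follows. This is a sketch under the stated normality assumption only: the standard deviation is not given in the report and is inferred here from the ten per cent figure, and the function names are hypothetical.

```python
from statistics import NormalDist


def sigma_from_tail(limit, fraction_outside):
    """Standard deviation for which the given fraction falls outside +/- limit."""
    z = NormalDist().inv_cdf(1.0 - fraction_outside / 2.0)
    return limit / z


def fraction_outside(limit, sigma):
    """Two-sided fraction of a zero-mean normal population outside +/- limit."""
    return 2.0 * (1.0 - NormalDist(0.0, sigma).cdf(limit))


sigma = sigma_from_tail(0.002, 0.10)        # inferred, in inches
tail_at_003 = fraction_outside(0.003, sigma)
print(f"sigma = {sigma:.5f} inch, fraction outside 0.003 inch = {tail_at_003:.3f}")
```

With the inferred standard deviation of roughly 0.0012 inch, between one and two per cent of characters fall outside ±0.003 inch, in line with the "approximately one per cent" quoted.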

It is felt that this measure of character misregistration is mainly due to the original document
variations.

The line follower action is as follows. A narrow slit approximately 14 character spaces in
length is divided into two equal parts optically (A and B below). Each area is examined for total
light level in turn by the same photomultiplier tube (PMT) through a 400 cycles-per-second
rotating optical chopper assembly. The characters are clear areas on the film (white on black), so
that each character is represented by a lighted area on a dark background. The light level in area
A is reduced by a factor of 0.5 by means of a neutral density filter and sensed by the PMT. The
light level in area B is then sensed by the same PMT. This sampling is synchronous with a 400
cps reference source, so that each sample area is shown relative to the reference timing. The
difference in PMT voltage levels (if present) is sensed and used as the feedback signal (error
detection) for the film positioning. The film is continuously positioned until the PMT signals
are equal.

An example of this is shown by using a lighted bar (C) and the two areas A and B.


A 

B 


The PMT output signals will be examined. With equal total illumination on A and B and keeping
in mind the 0.5 light reduction of the signal in A, the PMT signal may be normalized to
VA = 0.5 and VB = 1.0.

[Figure: PMT output at 400 cps versus time, alternating between VA = 0.5 and VB = 1.0.]

As the film (with bar C) enters the gate, the otherwise non-illuminated areas A and B receive
light.

The illumination of the slits first occurs in A, and the size of C will be assumed to cover 20%
of A. The VA output signal will therefore be 0.1 (0.5 x 0.2), while VB remains at 0 due to its
non-illumination. The signal difference is detected and used to advance the servoing of the film (and
area C) until the position shown below is reached.

The area of C in A is still 20%, giving a VA PMT signal of 0.1, and the area of C in B is 10%,
giving a VB PMT signal of 0.1 also. The error detection and correction signal is 0 and the servo
holds the film stationary.
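
The balance condition in this worked example can be sketched numerically. The code below is an idealized illustration only: it assumes the PMT signal is directly proportional to the illuminated fraction of each area (the logarithmic PMT response is ignored), and the function names and the sign convention for the error are hypothetical.

```python
FILTER_A = 0.5  # neutral density factor applied to area A, from the text
GAIN_B = 1.0    # area B is sensed without attenuation


def pmt_signals(frac_a, frac_b):
    """Normalized PMT signals for the illuminated fractions of areas A and B."""
    return FILTER_A * frac_a, GAIN_B * frac_b


def servo_error(frac_a, frac_b):
    """Difference VA - VB used as the error signal (sign convention assumed)."""
    v_a, v_b = pmt_signals(frac_a, frac_b)
    return v_a - v_b


entering = servo_error(0.20, 0.00)  # bar C covers 20% of A, none of B
balanced = servo_error(0.20, 0.10)  # bar C covers 20% of A and 10% of B
print(entering, balanced)
```

Because of the 0.5 filter on A, the null occurs when the bar covers twice as much of A as of B, so the bar settles at a fixed offset relative to the boundary between the two areas rather than centered on it.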

The descent of C causes VB to exceed VA and the servo to raise the film, and the ascent of
C causes VA to exceed VB and the servo to lower the film. The complete servo system may be block
diagrammed as below, where it is noted to have two feedback loops: one for mechanical position
detection and the one just described for error detection and correction.

Figure 5. Servo System Block Diagram.


The  gate  in  the  feedback  loop  is  used  to  inhibit  the  fine  positioning  signal  and  to  advance 
the  film  to  the  next  line. 

The regular occurrence of misregistration below a base line of several characters has been
noticed by Baird-Atomic and the masks shifted accordingly. This was taken into account in the
character digitalization.

A sample of one line of text was chosen at random and the operation of the line follower
simulated. The sampling mask was chosen as 0.01254 inch high when referenced back to the
document, which corresponds to 6 bits in the IBM 7090 simulation. The bottom edge of each character
was divided into ten slices, each one bit wide, and these slice areas computed for each character
in the sample line.


5 slices
5 slices

The action of the line follower was simulated by summing the respective slice areas for 14
consecutive character spaces and assuming that the area was evenly distributed in each slice.

Each slice (0.00209 inch on the document) was broken into 10 segments, each therefore about
0.2 thousandths of an inch. It was found that the addition of a Cyrillic letter such as P with the
large descending bar lowers the average line as established by the line follower by one segment,
or about 0.0002 inch at the document. The improbable occurrence of a string of 14 characters
with descenders could cause the line follower to descend 0.0028 inch. The average base line
may therefore be easily seen to remain within a ±1 bit (±0.00209 inch) tolerance. An exception
to the above tolerance limits occurs at the start and finish of a line, as the line follower is using
less than 14 character spaces to establish the average line.
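
The descender arithmetic above can be restated in a few lines. The sketch assumes that each descending character lowers the averaged base line by exactly one segment and that these effects simply add, which is how the 14-character figure in the text appears to have been obtained; the function names are hypothetical.

```python
BIT_INCH = 0.00209       # one slice, referenced to the document
SEGMENTS_PER_SLICE = 10  # each slice was broken into 10 segments
WINDOW = 14              # character spaces averaged by the line follower
SEGMENT_INCH = BIT_INCH / SEGMENTS_PER_SLICE


def drop_segments(n_descenders, segments_each=1):
    """Base line drop in segments, assuming one segment per descender (additive)."""
    return n_descenders * segments_each


def drop_inches(n_descenders):
    """Base line drop referenced to the document, in inches."""
    return drop_segments(n_descenders) * SEGMENT_INCH


def within_one_bit(n_descenders):
    """True while the drop stays inside the 1 bit (10 segment) tolerance."""
    return drop_segments(n_descenders) <= SEGMENTS_PER_SLICE


worst_case = drop_inches(WINDOW)
max_ok = max(n for n in range(WINDOW + 1) if within_one_bit(n))
print(f"worst case {worst_case:.4f} inch, within 1 bit up to {max_ok} descenders")
```

Under these assumptions the base line stays within one bit as long as no more than ten of the fourteen characters in the window carry descenders; the unrounded segment size gives a worst case slightly larger than the rounded 0.0028 inch quoted.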

This  simulation  has  ignored  the  logarithmic  response  of  the  PMT  in  order  to  keep  the  analysis 
tractable.  This  logarithmic  response  increases  the  sensitivity  of  the  servoing. 

SIMULATION  PROGRAM 

The Baird-Atomic character recognition scheme uses the identity:

P - N = 2P - (P + N)

where:

P is the measure of match between the input character and the positive reference mask (1 for
perfect match).

N is the measure of match between the input character and a negative reference mask (0 for
perfect match).

(P + N) is proportional to the area of the character and is measured by a clear field of view
aperture.

A perfect match therefore gives a channel output of P - N = 1 - 0 = 1 or 2P - (P + N) = 2 - 1 = 1
when normalized for that channel.
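
The identity can be illustrated on bit grids. In the sketch below each character is a set of (row, column) positions that hold a 1; P is counted as the input area falling on the reference shape and N as the input area falling outside it, and the result is normalized by the reference area as in the definition of Match. The toy shapes and all names are hypothetical, and the whole input is assumed to lie inside the field of view.

```python
def block(rows, cols):
    """A rectangular toy character: the set of (row, col) bits that are 1."""
    return {(r, c) for r in rows for c in cols}


def match(input_bits, reference_bits):
    """Normalized match 2P - (P + N) for one superposition."""
    p = len(input_bits & reference_bits)  # input area on the positive mask
    n = len(input_bits - reference_bits)  # input area on the negative mask
    total = len(input_bits)               # P + N, the clear field of view measure
    assert p - n == 2 * p - total         # the identity used by the scheme
    return (2 * p - total) / len(reference_bits)


reference = block(range(4), range(3))                # 12 bits
shifted = {(r + 1, c) for (r, c) in reference}       # same shape, one bit lower

perfect = match(reference, reference)
one_bit_off = match(shifted, reference)
print(perfect, one_bit_off)
```

The practical value of the right-hand form is that only the positive mask and an open aperture need to be measured; no separate negative mask is required.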

The following tabulation lists the Field-Of-View numbers, their width in bits (columns), and
the reference characters associated with each mask (See Table I).

Input and output for the IBM 7090 simulation program are accomplished through the medium of
magnetic tapes. Two input tapes are required, one consisting of digitalized reference masks and
the other, digitalized input characters. For a given reference-input combination the input
character is superimposed upon the reference mask at a number of space-time positions specified by
control parameters. For each overlay, the area common to the input character and reference mask
is compiled, as well as the area common to the input character (at that particular space-time
position) and the fixed field mask, whose absolute spatial position is defined relative to the
reference mask. A figure of match is then compiled and stored. When all specified superpositions
have been processed, the complete data for this input-reference combination is recorded on an output
tape (for future processing, if desired) and is simultaneously selectively edited and recorded on
another output tape, which then constitutes the primary results of the computation. The data
associated with the superposition yielding maximum match is stored.
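
The superposition procedure described above can be sketched as a loop over horizontal positions at a fixed vertical offset. This is a simplified illustration and not the original program: characters are sets of (row, column) bits, the fixed field mask is modelled simply as a range of columns, and every name and toy shape is hypothetical.

```python
def block(rows, cols):
    """A rectangular toy character: the set of (row, col) bits that are 1."""
    return {(r, c) for r in rows for c in cols}


def shift(bits, d_row, d_col):
    """Move a character by the given number of bits."""
    return {(r + d_row, c + d_col) for (r, c) in bits}


def sweep(input_bits, reference_bits, field_cols, vertical, col_offsets):
    """Return (best match, column offset) over the requested superpositions."""
    area = len(reference_bits)
    best = None
    for d_col in col_offsets:
        moved = shift(input_bits, vertical, d_col)
        in_field = {(r, c) for (r, c) in moved if c in field_cols}
        p = len(in_field & reference_bits)  # area common to input and reference
        total = len(in_field)               # area common to input and field mask
        value = (2 * p - total) / area      # figure of match for this overlay
        if best is None or value > best[0]:
            best = (value, d_col)
    return best


reference = block(range(4), range(3))
field = range(0, 3)  # field of view as wide as the reference (assumption)

aligned = sweep(reference, reference, field, vertical=0, col_offsets=range(-3, 4))
one_bit_low = sweep(reference, reference, field, vertical=1, col_offsets=range(-3, 4))
print(aligned, one_bit_low)
```

Keeping only the maximum over the sweep mirrors the statement that the data associated with the superposition yielding maximum match is stored.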

Five modes of operation of the simulation are provided (and selected by control parameters)
to allow for two objectives: (1) the specification of desired input-reference combinations; (2) the
thorough investigation of the actual operation of the hardware. One may specify computation for
a fixed set of space-time superpositions or operate under various and selectable degrees of
discrete approximation to the continuous passage of input characters over reference and fixed field
masks. When all specified input and reference combinations have been processed, the data
associated with the maximum match for each combination is edited, sorted and tabulated on the output
tape in two ways: (1) by order of reference character; and (2) by descending magnitude of the
figure of match.


TABLE I.

Mask Sizes and Associated Characters


Field-Of-View
Aperture No.     Width    Reference Characters

      1           13      78, 80, 82, 85
      2           19      83, 84
      3           25      75
      4           29      79, 86
      5           31      1, 46, 50
      6           33      6, 48, 72
      7           35      2, 3, 4, 7, 45, 60, 64, 76, 87, 88
      8           37      5, 8, 9, 10, 43, 44, 52, 53, 54, 61, 66, 74
      9           39      18, 47, 51, 58, 62
     10           43      14, 27, 57, 59, 65
     11           43      12, 13, 29, 39, 40, 42, 56, 63, 69
     12           45      16, 55
     13           47      21, 28, 32, 34
     14           51      11, 22, 25, 30, 70
     15           53      26, 31, 71, 73
     16           53      20, 24, 49, 77
     17           55      19, 33
     18           59      15, 37, 67, 68
     19           63      23, 38
     20           69
     21           71      41, 89
     22           73      17, 35, 36, 81

Two objectives were given prime consideration in the design of the program: (1) Speed of
computation. Approximately 230,000 double superpositions (input on reference and input on fixed
field mask) are accomplished in twenty minutes. Twenty more minutes are required for output and
editing of the resulting data. (2) Flexibility. Assuming compatibility with the physical limitations
of tape and storage, the program was designed to allow for the following possibilities. Input
routines are easily alterable to accommodate any type or number of characters or masks. The
algorithm which determines the area common to input and reference characters is easily alterable
to accommodate any size of character matrix. In addition, this section may easily be expanded to
include one of several schemes to simulate the effects of noise (vibration) on mask positioning.
The computational algorithm is easily alterable to admit the determination of any prescribed index
or measure of match. The editing portion of the program may be easily enlarged or may be
eliminated completely, relegating this function to auxiliary programs designed for less expensive
machines.


AUXILIARY  PROGRAMS 

A library of auxiliary programs was developed to provide input for and service output of the
simulation program. One program develops, from punched octal cards in a single pass, binary
tapes constituting perfect input characters and reference masks, elongated and shifted reference
masks (which constitute input tapes for the simulation), as well as a punch tape, which may be
used off-line to provide punched card decks corresponding to these output tapes. Other programs
were developed to process selectively the intermediate binary data output of the simulation,
preparing them for processing by existing large scale data sort, merge and editing programs.


DATA  ARRANGEMENT 

The printouts in Parts 1 thru 3 of the Data Printout Book* represent the character
digitalization. Part 1 is the unaltered characters with each star representing an area 0.00209 inch square
when referenced back to the original document. The reference line of each character falls on line
68 of the printouts, which is noted by a star. The width of each character is shown by the column
listing and the area of each character is listed as the "Total No. of Bits in Grid".

Parts 2 and 3 represent the character masks with respectively a ±1 and ±2 bit vertical
expansion. The character area change is shown and the vertical expansion is noted by the character
numbering. For example, #051 is unaltered character #51; #151 is character #51 with a ±1 bit
vertical expansion; and #251 is character #51 with a ±2 bit vertical expansion.

Parts 4 thru 8 are the simulation runs, with the page headings as follows:

Vertical ( ) - This is the vertical position in bits of the input character base line relative
to the reference character base line.

Page ( ) - Page number in ascending order for the data in each part.

The  column  headings  are  as  follows,  reading  from  right  to  left: 


REF       reference character number

INP       input character number

DEL       delta change in raster from row to row of data (1 = 1 bit)

NORM      normalizing area of reference character

L         line number position of right hand column of input character on reference
          character for which each row of data is listed

MASK      bit count (representing area measure) of input character occurring in mask
          associated with reference character (P + N)

P         count of common bits between input and reference character (measure of
          common area)

UN MAT    the match value in bits, 2P - (P + N)

MATCH     the match value normalized by the bit count (area) of the reference character


In order to simplify the data printout, only match values above 0.5 were printed. The maximum
value for each reference-input combination was flagged by placing a star next to the reference
character number in the row in which it occurs. When the crosscorrelation values are less than 0.5
for all positions, the maximum is printed as a single entry for that particular input-reference
combination.

The last four pages of each printout of Parts 4 thru 8 contain the maximum match value for each
input-reference combination ordered sequentially by reference character number and also by
decreasing match value. These four summary sheets for each part have been included at the end of
this report (Pages 26 thru 45).

The  fold-out  charts  on  Pages  12  thru  14  represent  a  tabulation  from  the  summary  sheets  of 
Pages  26  thru  45.  The  definitions  of  the  last  four  column  headings  are: 


E > F    Crosscorrelation of perfect on perfect at 0 vertical, greater than, autocorrelation of
         perfect on perfect at +1 vertical

E > J    Crosscorrelation of perfect on perfect at 0 vertical, greater than, autocorrelation of
         perfect on perfect at +2 vertical


*Sample pages from the Data Printout Book have been included and are Page 22 for Parts 1 thru 3 and Pages
23 thru 45 for Parts 4 thru 8.


[Fold-out charts, Pages 12 thru 14: tabulation of maximum match values by character, with column
groups including the ±1 Expanded Mask and the ±2 Expanded Mask. The tabulated values are not
legible in this copy.]


P > 1   Crosscorrelation of perfect on ±1 vertically expanded mask at 0 vertical, greater than one

S > 1   Crosscorrelation of perfect on ±2 vertically expanded mask at 0 vertical, greater than one

The sensitivity of the autocorrelation to vertical alignment was anticipated by Baird-Atomic,
and a possible solution was proposed: expand the masks in the vertical direction so that the
input character falls totally within the mask even if misregistered by an amount equal to or
less than the mask expansion. The vertical mask expansion gives a proportional decrease in
sensitivity to vertical position, but also gives rise to an increase in the crosscorrelation
maximum (see Figure 7).
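
The effect of vertical mask expansion can be illustrated with a minimal sketch. It assumes the match signal 2P - (P + N) described later in this report, with P taken as the character bits passed by the mask and N as the character bits falling outside it; the function names and the tiny 5 x 3 grid are hypothetical and are not taken from the simulation program.

```python
def expand_mask_vertically(mask, bits):
    """Slur a binary mask by +/- `bits` rows (a vertical dilation)."""
    rows, cols = len(mask), len(mask[0])
    expanded = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                for dr in range(-bits, bits + 1):
                    if 0 <= r + dr < rows:
                        expanded[r + dr][c] = 1
    return expanded


def match_value(character, mask, reference_area):
    """Normalized match 2P - (P + N), divided by the reference character area."""
    total = sum(sum(row) for row in character)  # P + N: all character bits
    inside = sum(ch & m                          # P: bits passed by the mask
                 for crow, mrow in zip(character, mask)
                 for ch, m in zip(crow, mrow))
    return (2 * inside - total) / reference_area


# Hypothetical reference character: a horizontal bar, 3 bits in area.
reference = [[0, 0, 0], [0, 0, 0], [1, 1, 1], [0, 0, 0], [0, 0, 0]]
# The same bar misregistered one bit upward.
shifted = [[0, 0, 0], [1, 1, 1], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
area = 3

on_perfect_mask = match_value(shifted, reference, area)
on_slurred_mask = match_value(shifted, expand_mask_vertically(reference, 1), area)
print(on_perfect_mask, on_slurred_mask)
```

In this toy case the misregistered bar falls entirely outside the perfect mask but entirely inside the ±1 bit expanded mask. The same expansion also admits more bits from other characters, which is the loss in discrimination noted above.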

Figures  6  and  7  below  indicate  the  changes  in  the  autocorrelation  and  crosscorrelation  values 
due  to  vertical  misregistration  and  vertical  mask  slurring. 

Figure 6 shows that the median autocorrelation drops from a normalized maximum of 1.0 when
correctly positioned to a value of 0.76 when displaced vertically by one bit (0.00209 inch at the
document). This autocorrelation median drops to 0.56 for a two bit vertical shift.

Figure 7 shows the range of crosscorrelation maximums for perfect masking and for vertically
slurred masks. The increased vertical insensitivity obtained by vertically expanding (slurring)
the mask is offset by an increase in the crosscorrelation values, that is, a loss in discrimination.

The match values obtained for a given reference character are normalized by the area of that
reference character. This sets 1.0 as a perfect match and references all other match values to
this. The actual channel voltage signals are dependent on the light source, system transmission,
PMT gain, noise levels, etc. The relative channel signals may be calculated on the basis of the
reference character area. For example, the smallest character is a period, #79 (.), and the largest
is a capital letter, #17 (Ж). Their respective areas are 65 bits and 1252 bits. Therefore, although
both perfect match values are 1.0, the voltage level in channel #17 is 1252/65 = 19.3
times that in channel #79.
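
The normalization and the relative channel signal can be checked with a short sketch. The function names are hypothetical; only the two areas (65 and 1252 bits) come from the text.

```python
def normalized_match(raw_match_bits, reference_area_bits):
    """Match value referred to a perfect match of 1.0 for this reference."""
    return raw_match_bits / reference_area_bits


def relative_channel_signal(area_bits, other_area_bits):
    """Ratio of perfect-match signal levels between two channels."""
    return area_bits / other_area_bits


period_area = 65      # character #79
largest_area = 1252   # character #17

ratio = relative_channel_signal(largest_area, period_area)
print(round(ratio, 1))  # 19.3
```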

ERROR  RATES 

An error rate table was constructed by using the autocorrelation and crosscorrelation tabulation
for perfect characters with misalignment and on expanded (slurred) masks.

The error rate is given in Table II and is the number of missed characters per 1000 random
input characters. The system noise is completely ignored and an error is defined only when the
crosscorrelation becomes equal to or greater than the autocorrelation in a given channel. The
area dropout or addition per cent is a direct measure of the reduced or increased character area
as seen by the character matrix.

The error rate has been tabulated taking into account the character frequencies as given in
the NYU study. (Reference, Page 1.) The tabulation of the upper and lower case character
frequencies has been included and is shown in Table III (Page 19).
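
One plausible reading of this tabulation is sketched below. It is an assumption, not the documented procedure: an input character is counted as an error whenever its crosscorrelation in some other channel equals or exceeds that channel's autocorrelation, and it then contributes its full frequency of occurrence. The correlation values are hypothetical; the frequencies are those listed for characters #51, #56, and #58.

```python
def error_rate_per_1000(auto, cross, freq_percent):
    """Lower-bound errors per 1000 random input characters.

    auto:         {channel: autocorrelation value}
    cross:        {(input_char, channel): maximum crosscorrelation}
    freq_percent: {input_char: frequency of occurrence in per cent}
    Returns (errors per 1000 characters, number of contributing characters).
    """
    confused = {inp for (inp, channel), value in cross.items()
                if value >= auto[channel]}
    errors = sum(freq_percent[inp] for inp in confused) * 10.0
    return errors, len(confused)


# Hypothetical correlation values for illustration only.
auto = {51: 0.76, 56: 0.76}
cross = {(56, 51): 0.80, (51, 56): 0.60, (58, 51): 0.70}
freq_percent = {51: 8.77, 56: 6.59, 58: 2.37}

errors, contributing = error_rate_per_1000(auto, cross, freq_percent)
print(errors, contributing)
```

Here only input #56 reaches the autocorrelation level of channel #51, so it alone contributes to the error count.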

The electronic controls for the character channels generate an inhibit pulse blocking all
channels for the remainder of the character space after one character channel has triggered. This
technique is helpful when crosscorrelation peaks are close to or greater than autocorrelation peak
values but occur (in time) after the autocorrelation peak has triggered the correct channel. When
the competitive crosscorrelation peak would ordinarily occur before the autocorrelation peak, a
similar blanking effect is obtained by shifting the relative positions of the reference apertures so
as to change the order in which the peaks occur. This blanking possibility was not taken into
account in computing the error rates in Table II.
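
The inhibit logic can be pictured with a minimal sketch. The peak times, values, and the 0.94 threshold used here are illustrative assumptions; the function is hypothetical and does not model the actual channel electronics.

```python
def first_channel_to_fire(peaks, threshold):
    """Return the channel whose peak first reaches the threshold.

    peaks: list of (time, channel, peak_value) within one character space.
    Once a channel fires, all later peaks are inhibited, so only the
    earliest qualifying peak matters.
    """
    for _time, channel, value in sorted(peaks):
        if value >= threshold:
            return channel
    return None


# Competing crosscorrelation peak (channel 56) arrives after the correct one.
late_competitor = [(3.0, 56, 0.96), (1.0, 51, 1.00)]
# Same peaks, but the competitor arrives first.
early_competitor = [(0.5, 56, 0.96), (1.0, 51, 1.00)]

print(first_channel_to_fire(late_competitor, 0.94))
print(first_channel_to_fire(early_competitor, 0.94))
```

The second case is the one that shifting the reference apertures is meant to avoid, since it reorders the peaks so that the correct channel fires first.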

When  this  blanking  is  incorporated  along  with  a  ±0.004  inch  expanded  mask,  as  much  as  7%  of 
the  area  of  a  character  can  be  deleted  with  a  vertical  misalignment  of  as  much  as  ±0.004  inch 
before  the  first  error  occurs. 

Figure 6. Auto-Correlation Variations.

It is possible to calculate another error rate due to the Gaussian nature of the photomultiplier
tube (PMT) output. This was done for channel #51, representing the Cyrillic letter и. The shot
noise of the photomultiplier tubes was assumed to have a Gaussian distribution with a standard
deviation in photo-electrons equal to the square root of the total number of photo-electrons
occurring per sample time.* The recognition technique is such that the signal from the positive
mask PMT is doubled (2P) and summed with the inverted signal from the field of view (FOV)
aperture PMT [-(P + N)]. The standard deviation of the total signal is then equal to

$$\sigma = \sqrt{(2\sigma_P)^2 + (\sigma_{(P+N)})^2}$$

where

$\sigma_P$ = standard deviation of mask PMT signal

$\sigma_{(P+N)}$ = standard deviation of FOV PMT signal

*Smullin, L. D. and Haus, H. A. (eds.): "Noise in Electron Devices," Page 98, John Wiley and Sons, Inc.,
New York, 1959.
The threshold level for firing the channel indicator was chosen for minimum error at a point
between the maximum crosscorrelation (0.89) and the autocorrelation (1.0) signal where the
standard deviations of the two signals were equal (0.94). The highest crosscorrelation (0.89) was
for the perfect character #56 (н) on the ±1 bit vertically expanded (slurred) mask of reference
character #51.
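
The threshold choice can be sketched under one interpretation, which is an assumption: the threshold sits the same number of standard deviations above the crosscorrelation mean as below the autocorrelation mean, so both Gaussian tails have equal area. The 0.03 standard deviation used below is a hypothetical value chosen only to exercise the functions.

```python
import math


def gaussian_tail(z):
    """Probability that a Gaussian variable exceeds its mean by z sigmas."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def equal_z_threshold(mu_low, sigma_low, mu_high, sigma_high):
    """Threshold t with (t - mu_low)/sigma_low == (mu_high - t)/sigma_high."""
    return (mu_low * sigma_high + mu_high * sigma_low) / (sigma_low + sigma_high)


cross_mean, auto_mean = 0.89, 1.0
sigma = 0.03  # hypothetical, same for both signals

threshold = equal_z_threshold(cross_mean, sigma, auto_mean, sigma)
z = (auto_mean - threshold) / sigma
print(threshold, gaussian_tail(z))
```

With equal standard deviations the threshold falls midway between the two means; when the autocorrelation signal is slightly noisier than the crosscorrelation signal, the same formula moves the threshold toward the crosscorrelation mean.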


TABLE II.

Error Rates (Lower Bound)

Area Dropout   Perfect on Perfect   Perfect on Perfect   Perfect on ±1 Mask   Perfect on ±2 Mask
or Addition    0 Vertical           ±1 Vertical          0 Vertical           0 Vertical
in Per Cent    Errors    Char.      Errors    Char.      Errors    Char.      Errors    Char.

     0            0        0         238       10          33        2          33        2
     1            0        0         etc.                  33        2          34        3
     2            0        0                               33        2          34        3
     3            0        0                               33        2          34        3
     4            0        0                               33        2          34        4
     5            0        0                               34        3          34        4
     6            0        0                               34        3          34        4
     7            0        0                              100        4         100        5
     8            0        0                              100        4         104        7
     9            0        0                              104        5         104        7
    10            0        0                              107        6         195        9
    11            0        0                              195        7         etc.
    12            3        1                              etc.
    13            3        1
    14            3        1
    15            3        2
    16            3        2
    17          157        4
    18          169        6
                etc.

The  numbers  in  the  character  (char.)  column  refer  to  the  total  number  of  characters  contributing  to  the 
error  rate. 


Error  rates  above  15%  (150  per  1000)  are  not  extensively  tabulated. 
The  error  rate  listed  is  for  1000  random  input  characters. 
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The  error  rate  is  divided  into  two  parts: 

(1)  the  errors  which  occur  when  character  #51  appears  in  its  own  channel  and  is 
not  recognized,  and 

(2)  when  some  other  character  (as  #56  above)  appears  in  channel  #51  and  is 
recognized  incorrectly  as  character  #51 . 


TABLE III.
Character Frequencies

Character No.   Frequency       Character No.   Frequency
(Upper Case)    in Per Cent     (Lower Case)    in Per Cent

     11            0.09              43             7.21
     12            0.05              44             1.55
     13            0.18              45             4.56
     14            0.03              46             1.31
     15            0.03              47             2.72
     16            0.04              48             9.55
     17            0.01              49             0.80
     18            0.04              50             1.66
     19            0.08              51             8.77
     20            0.01              52             0.92
     21            0.09              53             3.30
     22            0.04              54             4.50
     23            0.09              55             3.21
     24            0.20              56             6.59
     25            0.09              57            10.15
     26            0.11              58             2.37
     27            0.07              59             5.21
     28            0.26              60             5.31
     29            0.09              61             6.56
     30            0.03              62             2.50
     31            0.02              63             0.21
     32            0.02              64             1.02
     33            0.00              65             0.31
     34            0.05              66             1.31
     35            0.01              67             0.80
     36            0.00              68             0.40
     37            0.00              69             0.01
     38            0.00              70             2.42
     39            0.00              71             1.18
     40            0.07              72             0.27
     41            0.00              73             0.48
     42            0.01              74             1.06

Frequency of Upper Case 1.18%        Frequency of Lower Case 98.82%


The first type of error (correct character not recognized) is represented by the area of the
autocorrelation probability curve shown in Figure 8 which is below the threshold value of 0.94.
The second type of error (wrong character recognized) is represented by the area of the
crosscorrelation probability curves shown in Figure 7 which exceeds the threshold value of 0.94.
The error rate contributions are shown below for channel #51 (и).


Figure 8. Match Values for Channel #51. (Probability versus match value at a 23.4 microsecond
sample time; labeled curves include the #58 input.)


The error rate per 1000 random characters for each input is obtained by multiplying the
character frequency of occurrence (see Table III, Page 18) by the error rate per 1000 repeats of
the same input character.

The total error rate contributed by this channel is therefore 2.61+ errors per 1000 random input
characters.
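
The weighting step can be reproduced with a short sketch. The function name is hypothetical; the frequencies and per-1000-repeat error rates are the values listed for inputs #51 and #58.

```python
def errors_per_1000_random(freq_percent, errors_per_1000_repeats):
    """Errors per 1000 random characters contributed by one input character."""
    return freq_percent / 100.0 * errors_per_1000_repeats


# Input #51 in its own channel: frequency 8.77%, 15.4 errors per 1000 repeats.
not_recognized = errors_per_1000_random(8.77, 15.4)
# Input #58 appearing in channel #51: frequency 2.37%, 8.2 errors per 1000 repeats.
wrong_recognized = errors_per_1000_random(2.37, 8.2)

print(round(not_recognized, 2), round(wrong_recognized, 2))  # 1.35 0.19
```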

This character channel (#51) has the highest error rate of all characters, but it is easily seen
that this order of magnitude error rate per channel is entirely excessive when there are 89
independent channels contributing.

This calculation was based on the light levels obtained with the source now mounted on the
machine (30 ampere tungsten ribbon filament) and on a sample time of 23.4 microseconds as an
indicated minimum from the IBM 7090 simulation. The light level at the PMT's was therefore 15.7
photo-electrons/bit/sample time, where a bit again refers to a square area 0.00209 inch on a side
referenced back to the original document.

This calculation was based only on the effects of shot noise in the PMT's and ignored Johnson
Noise, Amplifier Noise, Vibration Noise, etc. The system PMT noise level is now fairly high, but
a simple calculation reveals that a doubling* of the light level received at the PMT's decreases
the total error rate for Channel #51 by a factor of 8.4.


*An increase of the light level received at the PMT's may be accomplished by coating the main lens to
its specifications and by modifying the light source.


TABLE IV.

Shot Noise Error Rates For Channel #51

                                  Correct Character             Wrong Character
                                  Not Recognized                Recognized
Input Character   Frequency       Errors per     Errors per     Errors per     Errors per
 #      Letter    in Per Cent     1000 Repeats   1000 Random    1000 Repeats   1000 Random
                                  of Input       Characters     of Input       Characters

 51       и          8.77            15.4           1.35            -              -
 56       н          6.59             -              -             15.4           1.07
 58       п          2.37             -              -              8.2           0.19
 65       ц          0.31             -              -              0+             0+

 Sub Totals                                         1.35                          1.26+

CONCLUSIONS 

The IBM 7090 simulation of an idealized electro-optical system indicates that for very high
quality characters and with very close tolerances on text registration, the basic masking technique
yields adequate discrimination between characters. In addition, an analysis of the line following
technique indicates the basic approach will position excellent quality text to within a ±0.002 inch
vertical displacement referred to original text. This variation is the result of the random
occurrence of input characters which contain descenders.

The tolerance of the masking technique to vertical misalignment can be improved with the
vertical expansion (slurring) of the mask character as proposed by Baird-Atomic. However, as the
amount of vertical slurring is increased, the discrimination between characters decreases and
becomes more sensitive to characters malformed by area loss or addition.

For example, using a reasonably slurred mask (±0.004 inch referred to original document),
character area losses or additions in excess of 7% will seriously affect the reading reliability,
even when the input characters are registered vertically within ±0.004 inch of the correct position.

The computer simulation undertaken under this task, although very comprehensive, is not
conclusive for the following reasons. First, it dealt with only a single type font, and does not
necessarily illustrate the typical character discriminations resulting with other type fonts.
Second, there is no known study of the quality of Russian technical text which will relate the
real life input from technical journals of interest to the printing quality constraints shown
necessary by the simulation. And thirdly, it has not been established that it is technologically
feasible to construct a practical system in which parameters such as electrical noise, optical
resolution, and mechanical vibrations can be specified and held within adequate tolerances to
obtain the character discrimination necessary in a useful system. Feasibility of such an effort
can only be determined by a comprehensive mechanical, electrical, and optical analysis of any
proposed system, probably involving additional digital simulation.
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TOTAL  NO.  OF  BITS  IN  GRID  481 


SIMULATION  DATA 


VERTICAL 0                                                  PAGE 106

 MATCH   UNMAT     P   MASK     L   NORM   DEL   INP   REF

0.0977      47   196    345    32    481     1     1   51*
0.1040     -50   250    550    30    481     1     2   51*
0.0936     -45     0     45    61    481     1     3   51*
0.0021       1   209    417    18    481     1     4   51*
0.0665     -32   113    258    11    481     1     5   51*
0.0042      -2   131    264    50    481     1     6   51*
0.0582     -28     0     28    61    481     1     7
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0.1518 

-73 

0 

73 

61 
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1 

8 
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0.0852 

-41 

1*0 

321 

11 
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1 
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12 

145 

278 
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1 

10 
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0.0333 

-16 

0 

16 

77 

*81 

1 

11 

51* 

0.0561 

-27 

0 

27 

70 

*81 

1 

12 

51* 

0.0*57 

-22 

0 

22 

72 

*81 

1 

13 

51* 

0.0270 

-13 

203 

*19 

52 

481 

1 

1* 

51* 

0.0665 

-32 

0 

32 

8* 

481 

1 

15 

51* 

0.0291 

-1* 

195 

*0* 

57 

481 

1 

16 

51* 

0.05*1 

-26 

0 

26 

101 

481 

1 

17 

51* 

0.1*55 

-70 

65 

200 

58 

481 

1 

18 

51* 

0.0187 

-9 

284 

577 

*8 

*81 

1 

19 

51* 

0.0457 

-22 

192 

*06 

15 

481 

1 

70 

51* 

0.0104 

-5 

195 

395 

61 

481 

1 

71 

51* 

0.0478 

23 

195 

367 

16 

481 

1 

22 

51* 

0.0312 

-15 

0 

15 

92 

481 

1 

23 

51* 

0.0665 

-32 

0 

32 

81 

481 

1 

2* 

51* 

0.141* 

-68 

117 

302 

67 

481 

1 

75 

51* 

0.0083 

* 

197 

390 

63 

481 

1 

76 

51* 

0.0*78 

-23 

0 

23 

71 

*81 

1 

27 

51* 

0.0852 

-*1 

133 

307 

62 

*81 

1 

28 
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0.07*8 
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0 

36 

72 
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1 

29 
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0 

13 
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1 

*0 
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6 
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82 
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1 

*1 
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1 

*7 
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*22 

31 

*81 

1 

*3 
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29 

160 

291 

51 

*81 

1 

** 
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0.*615 

222 

307 

392 

29 

*81 

1 

*5 

51* 

0.3680 

177 

228 

279 

26 

*81 

1 

*6 
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0.3139 

151 

207 

263 

16 

*81 

1 

*7 

51* 

0.2328 

112 

250 

388 

29 

*31 

1 

*8 

51* 

0.*532 

218 

381 

5** 

50 

*81 

1 

*9 
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0.2682 

129 

228 
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29 

*81 

1 

50 
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0.6923 

333 

*07 
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31 
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1 

51 

51 
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*81 
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32 
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33 
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32 
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PAGE 
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P 
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L 
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31 
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1 

52 

51 
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31 

* 

0*5114 
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336 
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32 

481 

1 

53 

51 
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32 

» 
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13 
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1 

54 
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14 

481 

1 

55 
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30 
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1 

56 

51 
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31 

0*6541 
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383 
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32 
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397 
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31 

« 
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76 
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50 
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1 

57 
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370 
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31 
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1 

58 
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3? 
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37  5 

493 

33 

0.7256 
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493 

32 
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0*1915 

92 

284 

476 

30 

481 

1 

59 

51* 

0*2121 

102 

202 

302 

29 

481 

1 

60 

51* 

0.2911 

140 

215 

290 

39 

481 

1 

61 

51* 

-0*0166 

—8 

62 

132 

12 
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1 
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192 
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94 

609 

39 

646 

641 

636 

1*0312 

299 

49 

426 

32 

322 

477 

432 

1*0141 

221 

17 

718 

44 

864 

788 

712 

0*9916 

214 

29 

910 

29 

371 

930 

489 

0*9988 

294 

98 

461 

3? 

493 

462 

431 

0,9349 

244 

70 

393 

31 

404 

383 

966 

0*9266 

220 

19 

912 

49 

.974 

9ft6 

838 

0*9189 

244 

67 

720 

*49 

707 

683 

699 

0*9133 

?51 

96 

481 

31 

461 

448 

435 

0*9044 

249 

68 

334 

36 

506 

494 

48? 

0,9026 

219 

12 

831 

40 

813 

776 

739 

0*8893 

274 

42 

107 

10 

148 

121 

94 

0,8783 

244 

61 

279 

21 

283 

269 

?45 

0.8781 

290 

43 

327 

28 

422 

334 

?86 

0.0746 

249 

67 

39? 

92 

484 

41? 

340 

0*8673 

212 

13 

813 

40 

831 

767 

703 

0*8647 

294 

69 

493 

36 

534 

480 

4?6 

ft*8641 

247 

68 

707 

93 

720 

665 

610 

0.8628 

203 

8 

911 

9ft 

696 

968 

44  ft 

0*8611 

201 

22 

345 

16 

383 

338 

293 

0,8493 

272 

30 

331 

?6 

327 

310 

,  293 

0,8348 

271 

69 

362 

28 

395 

348 

301 

0,8315 

224 

33 

994 

94 

1009 

916 

823 

0,8280 

240 

48 

302 

27 

388 

319 

250 

0,8278 

219 

24 

974 

5ft 

994 

900 

806 

ft,82T5 

224 

33 

812 

54 

994 

e30 

666 

0,8202 

2  92 

91 

388 

32 

481 

477 

473 

0,8044 

247 

94 

413 

3? 

420 

376 

332 

0,8099 

274 

96 

422 

31 

461 

399 

937 

ft.7986 

214 

13 

661 

41 

831 

674 

517 

0.7821 

297 

73 

?9ft 

■*ft 

393 

344 

305 

0*7821 

222 

19 

68? 

53 

805 

668 

531 

0,7786 

233 

24 

1039 

30 

994 

898 

802 

0.7719 

249 

38 

233 

13 

333 

?S7 

181 

0,7702 

229 

41 

788 

45 

626 

612 

598 

0,7589 

248 

60 

388 

26 

302 

298 

294 

0*7577 

206 

3 

499 

30 

509 

443 

377 

0,7595 

237 

12 

621 

94 

813 

715 

617 

0,7915 

210 

9 

366 

31 

993 

509 

423 

0,7473 

224 

29 

987 

48 

712 

373 

434 

0.7394 

294 

47 

420 

96 

411 

338 

305 

0,726? 

218 

13 

646 

36 

816 

641 

466 

0.7214 

239 

96 

123? 

7ft 

1213 

1049 

985 

0.7183 

219 

36 

828 

93 

832 

72? 

992 

0*7130 

243 

90 

422 

23 

327 

314 

301 

0,7133 

209 

10 

993 

90 

366 

494 

422 

0*7092 

208 

3 

696 

30 

911 

300 

489 

0.7076 

229 

14 

642 

48 

467 

450 

449 

0*6994 

227 

29 

790 

29 

999 

334 

509 

0*6973 

744 

49 

949 

90 

974 

380 

236 

0*6762 

294 

99 

1213 

66 

1232 

1024 

816 

0*6727 

746 

98 

997 

31 

493 

379 

265 

0.6673 

244 

97 

949 

30 

390 

389 

388 

0*6632 

2fi 

37 

349 

49 

390 

383 

976 

0*6608 

261 

44 

916 

34 

276 

243 

208 

0,6382 

44 


TABLE OF MAXIMA        VERTICAL 0        PAGE 2

REF   INP   NORM   LINE   MASK   P   UNMAT   MATCH

282 

78 

1*8 

9 

107 

IP? 

97 

0.693* 

20* 

19 

*90 

29 

319 

*20 

321 

0.6351 

203 

6 

509 

So 

*99 

*12 

325 

'■'.6385 

2*2 

17 

786 

68 

87«» 

690 

501 

"•637* 

233 

39 

1000 

79 

603 

603 

606 

0.6030 

270 

71 

603 

28 

762 

36? 

362 

0.4987 

263 

67 

778 

*3 

677 

570 

463 

0.3931 

2*1 

21 

1007 

*3 

718 

638 

698 

0.3938 

2*0 

18 

560 

39 

6*6 

48  8 

lift 

0.6897 

217 

21 

125? 

71 

718 

716 

71? 

1,6687 

23* 

36 

667 

96 

396 

385 

374 

0,6650 

2*9 

33 

755 

52 

*26 

426 

426 

0,36*2 

239 

38 

*76 

32 

*93 

378 

263 

0.5325 

286 

27 

321 

2* 

33V 

258 

177 

0.551* 

280 

15 

11* 

54 

80 

71 

6? 

0.«*3V 

232 

17 

7*8 

66 

971 

663 

191 

0,5185 

235 

31 

588 

*0 

4*1 

387 

297 

0,4987 

283 

3* 

316 

10 

222 

18? 

14? 

0,4*94 

28* 

77 

3*6 

77 

319 

237 

153 

0,4*80 

?n2 

9 

330 

’10 

396 

415 

574 

".4571 

288 

*9 

733 

46 

646 

i*7 

149 

0,4197 

281 

23 

108* 

6* 

1037 

742 

447 

0 . 4l ?4 

287 

60 

328 

30 

302 

?18 

174 

0.4086 

207 

73 

38* 

?6 

248 

200 

132 

0,3958 

223 

3C 

1037 

60 

680 

3*5 

410 

0.3944 

231 

1 

8*0 

37 

343 

1*1 

737 

0.7919 

211 

32 

665 

37 

588 

416 

244 

0.7669 

276 

89 

77 

63 

10? 

64 

?6 

C.3777 

277 

75 

78? 

3* 

248 

2*8 

2*8 

o,317l 

289 

76 

739 

16 

77 

77 

77  ' 

0.2973 

273 

H 

?*8 

66 

304 

187 

70 

0,9827 

230 

23 

68o 

r>8 

9«1 

338 

175 

0,5574 

262 

3* 

36* 

46 

??? 

143 

74  • 

0,2637 

779 

80 

63 

9 

11* 

63 

12 

0.18*6 

